{
 "cells": [
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# Evaluation DUACS OI: \n",
    "\n",
    "This notebook presents the evaluation of the SSH reconstructions based on the DUACS OI ([Taburet et al., 2018](https://os.copernicus.org/articles/15/1207/2019/)) and performed for the **\"2020a_SSH_mapping_NATL60\" ocean data challenge**. \n",
    "\n",
    "Two experiments are analysed: \n",
    "    \n",
    ">   A) **Experiment 1**: DUACS SSH reconstruction with **4 nadir altimeters**\n",
    "\n",
    ">   B) **Experiment 2**: DUACS SSH reconstruction with **1 SWOT + 4 altimeters**\n",
    "\n",
    "The evaluations are based on the methods & metrics described in the \"example_data_eval.ipynb\" notebook"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "import xarray as xr\n",
    "import numpy\n",
    "import hvplot.xarray\n",
    "import pyinterp\n",
    "import dask\n",
    "import warnings\n",
    "import xrft\n",
    "import os\n",
    "import sys\n",
    "import pandas as pd\n",
    "import logging\n",
    "warnings.filterwarnings('ignore')"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "##### libraries versions"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "print('xarray', xr.__version__)\n",
    "print('numpy', numpy.__version__)\n",
    "print('hvplot', hvplot.__version__)\n",
    "print('pyinterp', pyinterp.__version__)\n",
    "print('dask', dask.__version__)\n",
    "print('logging', logging.__version__)\n",
    "print('xrft', xrft.__version__)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "logger = logging.getLogger()\n",
    "logger.setLevel(logging.INFO)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "sys.path.append('..')"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "from src.mod_oi import *\n",
    "from src.mod_inout import *\n",
    "from src.mod_regrid import *\n",
    "from src.mod_eval import *\n",
    "from src.mod_plot import *"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "##### Read Nature run SSH for mapping evaluation"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "if not os.path.exists('../dc_ref'):\n",
    "    print('dc_ref folder not found...')\n",
    "    print('download it...')\n",
    "    # Get nature run (it may take several minutes depending on your connection!!!!)\n",
    "    !wget https://ige-meom-opendap.univ-grenoble-alpes.fr/thredds/fileServer/meomopendap/extract/ocean-data-challenges/dc_data1/dc_ref.tar.gz\n",
    "    !tar -xvf dc_ref.tar.gz --directory ../\n",
    "    !rm -f dc_ref.tar.gz\n",
    "    \n",
    "dc_ref = xr.open_mfdataset('../dc_ref/*.nc', combine='nested', concat_dim='time', parallel=True)\n",
    "dc_ref"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "# Domain for analysis\n",
    "time_min = numpy.datetime64('2012-10-22')                # domain min time\n",
    "time_max = numpy.datetime64('2012-12-02')                # domain max time\n",
    "lon_min = -64.975                                        # domain min lon\n",
    "lon_max = -55.007                                        # domain max lon\n",
    "lat_min = 33.025                                         # domain min lat\n",
    "lat_max = 42.9917                                        # domain max lat"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "##### Select time window sample for evaluation "
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "dc_ref_sample = dc_ref.sel(time=slice(time_min, time_max), \n",
    "                           lon=slice(lon_min, lon_max), \n",
    "                           lat=slice(lat_min, lat_max)) #.resample(time='1D').mean()\n",
    "del dc_ref\n",
    "dc_ref_sample"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "#### - Evalutation DUACS OI with 4 nadirs (Experience 1)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "# Read DUACS SSH reconstruction\n",
    "!wget https://ige-meom-opendap.univ-grenoble-alpes.fr/thredds/fileServer/meomopendap/extract/ocean-data-challenges/dc_data1/dc_mapping/2020a_SSH_mapping_NATL60_DUACS_en_j1_tpn_g2.nc\n",
    "ds_oi1_grid = xr.open_dataset('2020a_SSH_mapping_NATL60_DUACS_en_j1_tpn_g2.nc')\n",
    "ds_oi1_grid"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "output_directory = '../results/'\n",
    "if not os.path.exists(output_directory):\n",
    "    os.mkdir(output_directory)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "%%time\n",
    "# Regrid    \n",
    "ds_oi1_regrid = oi_regrid(ds_oi1_grid, dc_ref_sample)\n",
    "# Eval\n",
    "rmse_t_oi1, rmse_xy_oi1, leaderboard_nrmse, leaderboard_nrmse_std = rmse_based_scores(ds_oi1_regrid, dc_ref_sample)\n",
    "psd_oi1, leaderboard_psds_score, leaderboard_psdt_score  = psd_based_scores(ds_oi1_regrid, dc_ref_sample)\n",
    "filename_rmse_t = output_directory + 'rmse_t_duacs_ssh_reconstruction_2012-10-22-2012-12-02_jason1_topex-poseidon_interleaved_envisat_geosat2.nc'\n",
    "filename_rmse_xy = output_directory + 'rmse_xy_duacs_ssh_reconstruction_2012-10-22-2012-12-02_jason1_topex-poseidon_interleaved_envisat_geosat2.nc'\n",
    "filename_psd = output_directory + 'psd_duacs_ssh_reconstruction_2012-10-22-2012-12-02_jason1_topex-poseidon_interleaved_envisat_geosat2.nc'\n",
    "filename_dc_ref_sample = output_directory + 'dc_ref_2012-10-22-2012-12-02_sample.nc'\n",
    "filename_oi_regrid = output_directory + 'duacs_ssh_reconstruction_regridded_2012-10-22-2012-12-02_jason1_topex-poseidon_interleaved_envisat_geosat2.nc'\n",
    "# Save results\n",
    "rmse_t_oi1.to_netcdf(filename_rmse_t)\n",
    "rmse_xy_oi1.to_netcdf(filename_rmse_xy)\n",
    "psd_oi1.name = 'psd_score'\n",
    "psd_oi1.to_netcdf(filename_psd)\n",
    "dc_ref_sample.to_netcdf(filename_dc_ref_sample)\n",
    "ds_oi1_regrid.to_netcdf(filename_oi_regrid)\n",
    "# Print leaderboard\n",
    "data = [['duacs 4 nadirs', \n",
    "         leaderboard_nrmse, \n",
    "         leaderboard_nrmse_std, \n",
    "         leaderboard_psds_score, \n",
    "         leaderboard_psdt_score,\n",
    "        'Covariances DUACS',\n",
    "        'eval_duacs.ipynb']]\n",
    "Leaderboard = pd.DataFrame(data, \n",
    "                           columns=['Method', \n",
    "                                    \"µ(RMSE) \", \n",
    "                                    \"σ(RMSE)\", \n",
    "                                    'λx (degree)', \n",
    "                                    'λt (days)', \n",
    "                                   'Notes',\n",
    "                                   'Reference'])\n",
    "print(\"Summary of the leaderboard metrics:\")\n",
    "Leaderboard\n",
    "print(Leaderboard.to_markdown())"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "#### - Evaluation DUACS OI with 1 swot + 4 nadirs (Experience 2)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "# Read DUACS SSH reconstruction\n",
    "!wget https://ige-meom-opendap.univ-grenoble-alpes.fr/thredds/fileServer/meomopendap/extract/ocean-data-challenges/dc_data1/dc_mapping/2020a_SSH_mapping_NATL60_DUACS_swot_en_j1_tpn_g2.nc\n",
    "ds_oi2_grid = xr.open_dataset('2020a_SSH_mapping_NATL60_DUACS_swot_en_j1_tpn_g2.nc')\n",
    "ds_oi2_grid"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "# Define outputs\n",
    "output_directory = '../results/'\n",
    "if not os.path.exists(output_directory):\n",
    "    os.mkdir(output_directory)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "%%time\n",
    "# Regrid    \n",
    "ds_oi2_regrid = oi_regrid(ds_oi2_grid, dc_ref_sample)\n",
    "# Eval\n",
    "rmse_t_oi2, rmse_xy_oi2, leaderboard_nrmse, leaderboard_nrmse_std = rmse_based_scores(ds_oi2_regrid, dc_ref_sample)\n",
    "psd_oi2, leaderboard_psds_score, leaderboard_psdt_score  = psd_based_scores(ds_oi2_regrid, dc_ref_sample)\n",
    "filename_rmse_t = output_directory + 'rmse_t_duacs_ssh_reconstruction_2012-10-22-2012-12-02_swot_jason1_topex-poseidon_interleaved_envisat_geosat2.nc'\n",
    "filename_rmse_xy = output_directory + 'rmse_xy_duacs_ssh_reconstruction_2012-10-22-2012-12-02_swot_jason1_topex-poseidon_interleaved_envisat_geosat2.nc'\n",
    "filename_psd = output_directory + 'psd_duacs_ssh_reconstruction_2012-10-22-2012-12-02_swot_jason1_topex-poseidon_interleaved_envisat_geosat2.nc'\n",
    "filename_dc_ref_sample = output_directory + 'dc_ref_2012-10-22-2012-12-02_sample.nc'\n",
    "filename_oi_regrid = output_directory + 'duacs_ssh_reconstruction_regridded_2012-10-22-2012-12-02_swot_jason1_topex-poseidon_interleaved_envisat_geosat2.nc'\n",
    "# Save results\n",
    "rmse_t_oi2.to_netcdf(filename_rmse_t)\n",
    "rmse_xy_oi2.to_netcdf(filename_rmse_xy)\n",
    "psd_oi2.name = 'psd_score'\n",
    "psd_oi2.to_netcdf(filename_psd)\n",
    "dc_ref_sample.to_netcdf(filename_dc_ref_sample)\n",
    "ds_oi2_regrid.to_netcdf(filename_oi_regrid)\n",
    "# Print leaderboard\n",
    "data = [['duacs 1 swot + 4 nadirs', \n",
    "         leaderboard_nrmse, \n",
    "         leaderboard_nrmse_std, \n",
    "         leaderboard_psds_score, \n",
    "         leaderboard_psdt_score,\n",
    "        'Covariances DUACS',\n",
    "        'eval_duacs.ipynb']]\n",
    "Leaderboard = pd.DataFrame(data, \n",
    "                           columns=['Method', \n",
    "                                    \"µ(RMSE) \", \n",
    "                                    \"σ(RMSE)\", \n",
    "                                    'λx (degree)', \n",
    "                                    'λt (days)', \n",
    "                                   'Notes',\n",
    "                                   'Reference'])\n",
    "print(\"Summary of the leaderboard metrics:\")\n",
    "Leaderboard\n",
    "print(Leaderboard.to_markdown())"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "#### - PLOT EVALUATION SCORES"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "rmse_concat = xr.concat((rmse_t_oi1, rmse_t_oi2), dim='experiment')\n",
    "rmse_concat['experiment'] = [\"4 nadir\", \"1SWOT + 4 nadirs\"]\n",
    "rmse_concat.hvplot.line(x='time', y='rmse_t', by='experiment', ylim=(0, 1), cmap=['royalblue', 'lightcoral'], title='RMSE-based scores')"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "The figure above shows the time series of the RMSE scores for the reconstruction of SSH with 1 SWOT+ 4 nadirs and with 1 SWOT. The mean score is similar for each of the experiments. However, the variability of the RMSE score in the 1 SWOT + 4 nadirs reconstructions is slightly higher than in the 4 nadirs reconstruction only. This difference is potentially related to the fact that the estimation of SSH in the DUACS OI is based on a limited number of observations around the \"lon/lat/time\" estimation point and that the 21-day SWOT repetitivity modulates this variability locally more strongly than in the 4 nadirs case. A solution to reduce this variability in the 1 SWOT + 4 nadirs would be to tolerate a larger number of observations in the DUACS OI inversion. "
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "rmse_xy_concat = xr.concat((rmse_xy_oi1, rmse_xy_oi2), dim='experiment')\n",
    "rmse_xy_concat['experiment'] = [\"4 nadirs\", \"1 SWOT + 4 nadirs\"]\n",
    "rmse_xy_concat.hvplot.contourf(x='lon', y='lat', levels=list(numpy.arange(0.,0.75, 0.05)), height=300, width=400, cmap='Reds', subplots=True, by='experiment', clabel='RMSE[m]')"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "psd_concat = xr.concat((psd_oi1, psd_oi2), dim='experiment')\n",
    "psd_concat['experiment'] = [\"4 nadirs\", \"1 SWOT + 4 nadirs\"]"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "plot_psd_score_v0(psd_concat)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "The PSD-based score evaluates the spatio-temporal scales resolved in mapping (green area). Resolution limits can be defined as the contour where the PSD score = 0.5, black contour in the figure (i.e. space-time scales where the reconstruction SSH error level is 2 times lower than the real SSH signal). The figure above illustrates the spatio-temporal scales solved in each experiment 4 nadirs and 4 nadirs + 1 SWOT. It shows the slight increase in resolution capability from 4 nadirs altimeters to 4 nadirs + 1 SWOT altimeters for spatial wavelengths below 2 degrees. "
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": []
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "Python 3",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.7.7"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 4
}
